Amplifying the Impact of Open Access: Wikipedia and the Diffusion of Science

Authors

  • Misha Teplitskiy
  • Grace Lu
  • Eamon Duede

Abstract

With the rise of Wikipedia as a first-stop source for scientific knowledge, it is important to compare its representation of that knowledge to that of the academic literature. Here we identify the 250 most heavily used journals in each of 26 research fields (4,721 journals, 19.4M articles in total) indexed by the Scopus database, and test whether topic, academic status, and accessibility make articles from these journals more or less likely to be referenced on Wikipedia. We find that a journal’s academic status (impact factor) and accessibility (open access policy) both strongly increase the probability of its being referenced on Wikipedia. Controlling for field and impact factor, the odds that an open access journal is referenced on the English Wikipedia are 47% higher than the odds for a paywall journal. One implication of this study is that a major consequence of open access policies is to significantly amplify the diffusion of science, through an intermediary like Wikipedia, to a broad audience.

Introduction

Wikipedia, one of the most visited websites in the world, has become a destination for information of all kinds, including information about science (Heilman & West, 2015; Laurent & Vickers, 2009; Okoli, Mehdi, Mesgari, Nielsen, & Lanamäki, 2014; Spoerri, 2007). Given that so many people rely on Wikipedia for scientific information, it is important to ask whether and to what extent Wikipedia’s coverage of science is a balanced, high-quality representation of the knowledge within the academic literature. One approach to this question is to look at the references used in Wikipedia articles. Wikipedia requires all claims to be substantiated by reliable references, but what, in practice, are “reliable references”? An intuitive approach is to examine whether the sources Wikipedia editors use correspond to the sources scientists value most.
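As a reading aid for the abstract's headline estimate, the sketch below converts "odds 47% higher" into probabilities. Only the odds ratio of 1.47 comes from the text; the baseline paywall probabilities and the function names are hypothetical, chosen purely for illustration. It shows that an increase in odds is not the same as an equal increase in probability, although the two are close when the baseline probability is small.

```python
# Illustrative only: the odds ratio is taken from the abstract; the baseline
# probabilities below are hypothetical and are NOT results from the study.

ODDS_RATIO = 1.47  # "odds ... are 47% higher" for open access journals


def odds(probability: float) -> float:
    """Convert a probability in [0, 1) to odds."""
    return probability / (1.0 - probability)


def probability_from_odds(o: float) -> float:
    """Convert odds back to a probability."""
    return o / (1.0 + o)


def open_access_probability(paywall_probability: float,
                            odds_ratio: float = ODDS_RATIO) -> float:
    """Probability for an open access journal implied by a paywall baseline."""
    return probability_from_odds(odds(paywall_probability) * odds_ratio)


if __name__ == "__main__":
    for baseline in (0.01, 0.05, 0.20, 0.50):  # hypothetical baselines
        implied = open_access_probability(baseline)
        relative_gain = implied / baseline - 1.0
        print(f"paywall p={baseline:.2f} -> open access p={implied:.4f} "
              f"(relative increase {relative_gain:.1%})")
```

Running the loop shows the relative increase in probability shrinking as the baseline grows, which is why the abstract's figure is stated in terms of odds rather than probabilities.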
In particular, within the scientific literature, a journal’s status is often associated, albeit problematically (Seglen, 1997), with its impact factor. If status within the academic literature is taken as a “gold standard,” Wikipedia’s failure to cite high impact journals of certain fields would constitute a failure of coverage (Samoilenko & Yasseri, 2014), while a high correspondence between journals’ impact factors and citations in Wikipedia would indicate that Wikipedia does indeed use reputable sources (P. Evans & Krauthammer, 2011; Nielsen, 2007; Shuai, Jiang, Liu, & Bollen, 2013). Yet high impact journals often require expensive subscriptions (Björk & Solomon, 2012). The costs are, in fact, so prohibitive that even Harvard University has urged its faculty to “resign from publications that keep articles behind paywalls” because the library “can no longer afford the price hikes imposed by many large journal publishers” (Sample, 2012). Consequently, much of the discussion of open access focuses on its consequences for the scientific community (Van Noorden, 2013). A lively debate has arisen on the impact of open access on the scientific literature, with some studies showing a citation advantage (Eysenbach, 2006a, 2006b; Gargouri et al., 2010; “The Open Access Citation Advantage Service”) while others find none (Davis, Lewenstein, Simon, Booth, & Connolly, 2008; Davis, 2011; Gaulé & Maystre, 2011; Moed, 2007). Apart from this rather unclear impact on the scientific literature, open access journals may have a tremendous impact on the diffusion of scientific knowledge beyond that literature. To date, this potential of open access policies has been chiefly a matter of speculation (Heilman & West, 2015; Trench, 2008).
Previous research has found that open access articles are downloaded from publishers’ websites more often and by more people than closed access articles (Davis, 2010, 2011), but it is currently unclear by whom, and to what extent open access affects the use of science by the general public (Davis & Walters, 2011). We hypothesize that Wikipedia, with more than 8.5 million page views per hour, diffuses scientific knowledge to unprecedented distances and that diffusion of science through it may relate to accessibility in two ways. By referencing findings from paywall journals, Wikipedia distills and diffuses these findings to the general public. On the other hand, Wikipedia editors may be unable to access expensive paywall journals, and may consequently reference easily accessible articles instead. For example, Luyt and Tan (2010) found that accessibility drove the selection of references in a sample of Wikipedia’s history articles. In this case Wikipedia “amplifies” open access science by broadcasting its (already freely accessible) findings to millions. This “amplifier” effect may thus constitute one of the chief effects of open access.

Correspondence between academic and Wikipedia statuses

This article tests both the distillation and amplifier hypotheses by evaluating which references Wikipedia editors around the world use and do not use. In particular, we study how journals’ status within the scientific community (impact factor) and their accessibility (open access policy) correspond to their status within Wikipedia (percent of a journal’s articles referenced in Wikipedia). It is important to note that an observed correspondence may be produced by a variety of mechanisms besides the aforementioned accessibility.
First, the status ordering of academic journals as measured with impact factors may have only a tenuous relationship with the importance and notability (considerations of special relevance to Wikipedia) of the published research. Citations, and therefore impact factors, are in part a function of the research field (Seglen, 1997), and may be affected by factors as circumstantial as whether a paper’s title contains a colon (Jamali & Nikzad, 2011; Seglen, 1997). Second, the academic status ordering results from the objectives of millions of scientists and institutions, and may be irrelevant to the unique objectives of Wikipedia. Wikipedia’s key objective is to serve as an encyclopedia, not a medium through which scientists communicate original research. In contrast to the decentralized scientific literature, Wikipedia is governed by explicit, if flexible, policies and a hierarchical power structure (Butler, Joyce, & Pike, 2008; Shaw & Hill, 2014). Apart from a remark that review papers serve Wikipedia’s objectives better than primary research articles, Wikipedia’s referencing policies generally pass no judgment on which items within the scientific literature constitute “the best” evidence in support of a claim. Wikipedia’s objectives and explicit, centrally accessible policies differ from the decentralized decisions that produce status orderings within the scientific literature, and they do not imply that the two status orderings should correspond. Indeed, if editors are not scientists themselves, they need not even be aware that journal impact factors exist. On the other hand, despite the well-worn caveats, prestigious, high-impact journals may publish findings that are more important to both academics and Wikipedia’s audience. In fact, a Wikipedia editor’s expectation that the truly important research resides within high-impact journals may be enough to predispose them to want to reference such journals.
Third, little is known about editors of science-related articles (West, Weber, & Castillo, 2012); they may be professional scientists with access to these high-impact journals, resulting in both the motivation and opportunity to reference them.

Previous research

Wikipedia references and academic status

The first large-scale study of Wikipedia’s scientific references was performed by Finn Arup Nielsen (Nielsen, 2007). Nielsen found that the number of Wikipedia references to the top 160 journals, extracted from the cite-journal citation templates, correlated modestly with the journal’s Journal Citation Reports impact factor. The implication that Wikipedia preferentially cites high impact journals is delicate, in part because the data used in the study included only a subset of journals with references that appear in Wikipedia, rather than both journals that were and journals that were not referenced. It is possible, albeit unlikely, that an even larger number of prestigious journals, made invisible by the methodology, are never referenced on Wikipedia at all, weakening the correlation to an unknown degree, or that the referenced journals are simply those that publish the most articles (see Nielsen 2007: Fig. 1). Shuai, Jiang, Liu, and Bollen (2013) also found modest correlations when they investigated a possible correspondence between the academic rank of computer science papers, authors, and topics and their Wikipedia rank. The altmetrics movement has also explored Wikipedia as a non-academic venue on which academic literature makes an impact (ALM, Fenner, & Lin, 2014; “altmetrics”; Priem, 2015). Evans and Krauthammer (2011) examined the use of Wikipedia as an alternative measure of the scholarly impact of biomedical research. The authors correlated scholarly metrics of biomedical articles, journals, and topics with Wikipedia citations and, in contrast to other studies, included in some of their analyses a random sample of journals never referenced on Wikipedia.
The authors also recorded a journal’s open access policy but, unfortunately, do not appear to have used this information in their analyses.

Open access and the Web

The rather voluminous literature on open access has focused primarily on effects on the academic literature. There is some debate on the size and direction of open access effects. Some evidence demonstrates that open access articles gain a citation advantage (Eysenbach, 2006a, 2006b; Gargouri et al., 2010; “The Open Access Citation Advantage Service”), while other evidence shows no such effect (Davis et al., 2008; Davis, 2011; Gaulé & Maystre, 2011; Moed, 2007). Regardless of the impacts on scientists in developed nations, increased accessibility through open access does yield benefits to scientists from developing nations (Davis & Walters, 2011; J. A. Evans & Reimer, 2009). The promise of open access for disseminating scientific information to the world at large has received much less attention (Davis & Walters, 2011; Trench, 2008; for an exception see Heilman & West, 2015). Yet more and more of the world turns to the Web for scientific information. For instance, as early as 1999 a full 20% of American adults sought medical and science information online (Miller, 2001). What’s more, one who actively seeks such information within the academic literature will quickly discover that, despite the paywalls, many important and impactful research articles are made freely available by their authors or third parties (Björk, Laakso, Welling, & Paetau, 2014; Wren, 2005). This is to say nothing of the fact that science may also be disseminated through distillation of its findings into venues like Wikipedia or science-centric websites and blogs so that, here too, the impact of open access may be limited. While full texts of the most impactful literature are, at least nominally, behind a paywall (Björk & Solomon, 2012), do Wikipedia’s editors consult these texts?
If they cite them in Wikipedia, have they consulted the full texts beyond a freely available abstract before referencing? If the academic literature is any guide, referenced material is sometimes consulted rather carelessly (Broadus, 1983; Rekdal, 2014). In short, the current understanding of the relationship between open access and the general public in the literature is limited at best (Davis & Walters, 2011).

Shortcomings and our contribution

In addition to the role of accessibility, a number of substantive and methodological shortcomings remain. First, it is unclear if professional scientists edit Wikipedia’s science articles. As we will show below, a preponderance of paywall references would suggest, albeit indirectly, that this is the case. The scant existing evidence indicates that science articles are edited by people with general expertise, relative to the narrower experts of popular culture articles (West et al., 2012). Second, most previous studies have completely ignored the articles that are never referenced on Wikipedia, thus sampling on the dependent variable. The only notable exception (P. Evans & Krauthammer, 2011) treated the unreferenced articles outside the main analytic framework. While the framework treated (referenced) articles or journals as the unit of analysis, the unreferenced articles and journals were treated as a homogeneous

Journal:
  • JASIST

Volume 68

Publication date: 2017